A Query-by-Similarity Indexing Strategy for Web Forms

نویسندگان

  • Willian Ventura Koerich
  • Ronaldo dos Santos Mello
چکیده

Search engines do not provide speci c searches for Web forms related to the Deep Web, in particular, similarity search. To deal with this lack, we propose a query-by-similarity system called WF-Sim, and this paper presents the indexing strategy adopted by WF-Sim for querying-by-similarity Web forms. It is centered on suitable index structures to the main kinds of queries posed on Web forms, as well as some optimizations in order to reduce the number of index entries. To evaluate the indexes performance, we ran experiments on twoWF-Sim persistence strategies: le system and database. We also compare the performance of our indexes against the traditional keyword-based index, and the results were promising.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Sweet Spot between Inverted Indices and Metric-Space Indexing for Top-K-List Similarity Search

We consider the problem of processing similarity queries over a set of top-k rankings where the query ranking and the similarity threshold are provided at query time. Spearman’s Footrule distance is used to compute the similarity between rankings, considering how well rankings agree on the positions (ranks) of ranked items (i.e., the L1 distance). This setup allows the application of metric ind...

متن کامل

تأملاتی بر نمایه‌ سازی تصاویر: یک تصویر ارزشی برابر با هزار واژه

Purpose: This paper presents various  image indexing techniques and discusses their advantages and limitations.             Methodology: conducting a review of the literature review, it identifies three main image indexing techniques, namely concept-based image indexing, content-based image indexing and folksonomy. It then describes each technique. Findings: Concept-based image indexing is te...

متن کامل

Math Indexer and Searcher under the Hood: History and Development of a Winning Strategy

This paper describes and summarizes experience of Masaryk University Math Information Retrieval team (MIRMU) with the mathematical search developed and performed for the NTCIR-11 Math-2 Task. Our approach is the similarity search based on canonicalized MathML and second generation of scalable full text search engine Math Indexer and Searcher (MIaS) with attested state-of-the-art information ret...

متن کامل

Query-Driven Indexing in Large-Scale Distributed Systems

Efficient and effective search in large-scale data repositories requires complex indexing solutions deployed on a large number of servers. Web search engines such as Google and Yahoo! already rely upon complex systems to be able to return relevant query results and keep processing times within the comfortable sub-second limit. Nevertheless, the exponential growth of the amount of content on the...

متن کامل

Analysis of users’ query reformulation behavior in Web with regard to Wholis-tic/analytic cognitive styles, Web experience, and search task type

Background and Aim: The basic aim of the present study is to investigate users’ query reformulation behavior with regard to wholistic-analytic cognitive styles, search task type, and experience variables in using the Web. Method: This study is an applied research using survey method. A total of 321 search queries were submitted by 44 users. Data collection tools were Riding’s Cognitive Style A...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013